---
title: "Launch Week #1, Day 5. Online evaluations"
date: "2024-12-06"
description: "Online evaluations are a way to monitor and assess LLM behavior in real-time"
author:
  name: Robert Kim
  url: https://x.com/skull8888888888
image: /blog/2024-12-06-online-evals.jpg
tags: ["online evaluations"]
---

At Laminar, we're excited to announce our newest feature: Online Evaluations. This feature allows engineering teams to run custom evaluators, either LLM-based or Python-based, on their LLM calls as they happen in production.

## What are Online Evaluations?
Online evaluations run automated checks and produce labels on your LLM calls as they happen in production. Instead of collecting data for post-hoc analysis, Laminar automatically evaluates each model call in real-time by analyzing the inputs and outputs of your LLM spans.

## Why We Built It
When you have thousands of LLM calls happening every day, it's hard to know if your LLMs are behaving as expected. Online evaluations allow you to monitor the quality of your LLMs in real-time, collect performance statistics, and detect issues before they impact users.

## How It Works
Laminar's online evaluations system is built around three core concepts:

### 1. Span Paths

Span paths uniquely identify where LLM calls happen in your code. They're automatically constructed from the location of the call, making it easy to track specific functions and endpoints.

### 2. Span Labels

Labels are values attached to spans that indicate evaluation results. 

### 3. Evaluators

Evaluators analyze inputs and outputs to generate labels. Laminar supports two types of evaluators:

- LLM-based evaluators
- Python-based evaluators


## Setting Up Evaluations
![Setting up evaluations](/blog/2024-12-06-online-evals-example.png)

Getting started with Laminar's online evaluations is straightforward:

1. Navigate to "Traces" in your Laminar dashboard
2. Select the span you want to evaluate
3. Click "Add Label" and create or choose a label class
4. Configure your evaluator:

    - Choose between Python code or LLM-based evaluation
    - Test your evaluator directly in the UI
5. Save and enable for production

Once enabled, Laminar will automatically run your evaluator on your LLM calls and attach labels to the spans. This label will be marked as `AUTO` in the dashboard.

![Evaluations in action](/blog/2024-12-06-online-evals-test-label.png)

## Best Practices

Start Simple

- Begin with basic format and content checks
- Add more sophisticated evaluations gradually
- Monitor evaluator performance impact


Layer Your Checks

- Technical validation (format, structure)
- Content validation (completeness, relevance)
- Quality metrics (coherence, accuracy)

Monitor Results

- Track evaluation trends over time
- Regularly review and refine criteria


## Conclusion
Online evaluations represent a significant step forward in LLM operations, bringing immediate quality feedback to production systems. With Laminar's implementation, teams can maintain high standards while gathering valuable insights about their models' behavior.
Try out online evaluations today and let us know what you think! Check out our [documentation](https://docs.lmnr.ai/evaluations/online-evaluations) for detailed setup instructions and best practices.